ParaShares: Finding the Important Basic Blocks in Multithreaded Programs
Abstract
Understanding and optimizing multithreaded execution is a significant challenge. Numerous research and industrial tools debug parallel performance by combing through program source or thread traces for pathologies including communication overheads, data dependencies, and load imbalances. This work takes a new approach: it ignores any underlying pathologies, and focuses instead on pinpointing the exact locations in source code that consume the largest share of execution. Our new metric, ParaShares, scores and ranks all basic blocks in a program based on their share of parallel execution. For the eight benchmarks examined in this paper, ParaShare rankings point to just a few important blocks per application. The paper demonstrates two uses of this information, exploring how the important blocks vary across thread counts and input sizes, and making modest source code changes (fewer than 10 lines of code) that result in 14-92% savings in parallel program runtime.
Similar resources
Leveraging the Potential of Control-Flow Error Resilient Techniques in Multithreaded Programs
This paper presents a software-based technique to recover control-flow errors in multithreaded programs. Control-flow error recovery is achieved through inserting additional instructions into multithreaded program at compile time regarding to two dependency graphs. These graphs are extracted to model control-flow and data dependencies among basic blocks and thread interactions between different...
The Common Case in Forth Programs
Identifying common features in Forth programs is important for those designing Forth machines and optimisers. In this paper we measure the behaviour of six large Forth programs and four small ones. We look at the ratio of user to system code, basic block lengths, common instructions, and common sequences of instructions. Our most important finding is that for most large programs, many (38.4%– 4...
A Decoupled Scheduled Dataflow Multithreaded Architecture
In this paper we describe a new approach to designing multithreaded uniprocessors that can be used as the basic building blocks in high-end computing architectures. Our innovative design is a non-blocking multithreaded architecture where all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are poststored after the c...
Comparing Execution Performance of Scheduled Dataflow With RISC Processors
In this paper we describe a new approach to designing multithreaded architecture that can be used as the basic building blocks in high-end computing architectures. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are ...
Visualizing massively multithreaded applications with ThreadScope
As highly parallel multicore machines become commonplace, programs must exhibit more concurrency to exploit the available hardware. Many multithreaded programming models already encourage programmers to create hundreds or thousands of short-lived threads that interact in complex ways. Programmers need to be able to analyze, tune, and troubleshoot these large-scale multithreaded programs. To add...
Publication date: 2014